Semi-Supervised Methods for Improving Keyword Search of Unseen Terms
Authors
Abstract
We present a semi-supervised language modeling technique to improve search performance on terms without training data. Probabilities estimated from automatic transcripts of a large corpus of in-domain audio are added to an existing LM. Requiring neither development data nor external resources, our method achieves 70% of the gain possible from manual transcription of the same audio. This is in sharp contrast to the modest gains of previous semi-supervised LM experiments. We compare the value of additional resources (labor or data) to that of semi-supervised learning. If human effort is available, we describe a transcription regime to efficiently close the remaining performance gap.
Similar resources
Improving semi-supervised deep neural network for keyword search in low resource languages
In this work, we investigate how to improve semi-supervised DNN for low resource languages where the initial systems may have a high error rate. We propose using semi-supervised MLP features for DNN training, and we also explore using confidence to improve semi-supervised cross entropy and sequence training. The work conducted in this paper was evaluated under the IARPA Babel program for the keyw...
An Effective Path-aware Approach for Keyword Search over Data Graphs
Keyword search is known as a user-friendly alternative to structured languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...
Semi-supervised Induction with Basis Functions
Considerable progress was recently made on semi-supervised learning, which differs from traditional supervised learning by additionally exploring the information in the unlabeled examples. However, a disadvantage of many existing methods is that they do not generalize to unseen inputs. This paper suggests a space of basis functions to perform semi-supervised inductive learning. As a nice pr...
Zero-Shot Learning via Class-Conditioned Deep Generative Models
We present a deep generative model for Zero-Shot Learning (ZSL). Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), whi...
FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing
In this paper, we present a novel hybrid keyword spotting system that combines supervised and semi-supervised competitive learning algorithms. The first stage is a S-SOM (Semi-supervised Self-Organizing Map) module which is specifically designed for discrimination between keywords (KWs) and non-keywords (NKWs). The second stage is an FDVQ (Fuzzy Dynamic Vector Quantization) module which consists of...